TreesΒΆ
(Courtesy: By Paddy3118 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=83223854)
TreeΒΆ
A tree structure consists of nodes and edges in a hierarchical fashion.
relationships similar to those of a family tree:
- βchild,β βparent,β βancestor,β etc.
- The data elements are stored in nodes and pairs of nodes are connected by edges.
resembles an inverted real-life tree
- top node is called the root.
TreesΒΆ
TreesΒΆ
Binary TreesΒΆ
A binary tree is a tree in which each node can have at most two children.
One child is identified as the left child and the other as the right child.
Binary TreesΒΆ
Binary TreesΒΆ
Organized into levels with the root node at level 0,
- its children at level 1, the children of level one nodes are at level 2, and so on.
The depth of a node is its distance from the root.
- $G$ has a depth of 2, 3, and 6 in the figure above.
The height of a binary tree is the number of levels in the tree.
- heights: 4, 6, 8.
The width of a binary tree is the number of nodes on the level containing the most nodes.
- widths: 4, 3, 1
The size of a binary tree is simply the number of nodes in the tree.
Binary TreesΒΆ
Binary TreesΒΆ
Maximum height: $n$ (the number of levels in the tree)
What about minimum height?
- Each level $i$ can have at most $2^i$ nodes
If each level is completely filled:
- $n = 1$
- $level\ 0 = 2^0$
- $height = \log_2(1) + 1 = 1$
- $n = 3$
- $level\ 0 = 2^0, level\ 1 = 2^1$
- $height = \log_2(3) + 1 = 2$
- $n = 2$
- $level\ 0 = 2^0, level\ 1 = 1\ (\leq 2^1)$
- $height = \lfloor\log_2(2)\rfloor + 1 = 2$
- $n = 1$
Minimum height: $\lfloor\log_2(n)\rfloor + 1$
Full Binary TreeΒΆ
- A full binary tree is a binary tree in which each interior node contains two children.
- Or, in other words, each node contains zero or 2 children
Complete Binary TreeΒΆ
- A perfect binary tree till height $(h β 1)$ and
- the nodes on the lowest level fill the available slots from left to right leaving no gaps.
# The storage class for creating binary tree nodes.
class BinTreeNode :
def __init__( self, data ):
self.data = data
self.left = None
self.right = None
Tree TraversalsΒΆ
Operations that can be performed on a binary tree depend on the application
Tree Traversal is one of the operations
A traversal iterates through a collection, one item at a time, in order to "visit" (access) each item.
Tree TraversalsΒΆ
With a linear structure such as a linked list/array, the traversal is rather easy.
- Start with the first node and iterate through the nodes, by following the links/next location.
How do we "visit" all the nodes in a binary tree?
- If we simply follow the links, we won't be able to visit all.
Preorder Tree TraversalΒΆ
One way is for a tree traversal to begin with the root node.
After visiting the root node, we can then traverse the nodes in its left subtree followed by the nodes in its right subtree.
def preorderTrav( subtree ):
if subtree is not None :
print( subtree.data, end=" " )
print("going left of",subtree.data)
preorderTrav( subtree.left )
print("going right of",subtree.data)
preorderTrav( subtree.right )
# The storage class for creating binary tree nodes.
class BinTreeNode :
def __init__( self, data):
self.data = data
self.left = None
self.right = None
def insertL( self, lnode ):
self.left = lnode
def insertR( self, rnode ):
self.right = rnode
def PrintTree(self):
print( self.data),
if self.left:
self.left.PrintTree()
if self.right:
self.right.PrintTree()
A = BinTreeNode('A')
B = BinTreeNode('B')
C = BinTreeNode('C')
D = BinTreeNode('D')
A.insertL(B)
A.insertR(C)
B.insertL(D)
A.PrintTree()
A B D C
A = BinTreeNode('A')
B = BinTreeNode('B')
C = BinTreeNode('C')
D = BinTreeNode('D')
E = BinTreeNode('E')
F = BinTreeNode('F')
G = BinTreeNode('G')
H = BinTreeNode('H')
I = BinTreeNode('I')
J = BinTreeNode('J')
A.insertL(B)
A.insertR(C)
B.insertL(D)
B.insertR(E)
E.insertL(H)
C.insertL(F)
C.insertR(G)
G.insertL(I)
G.insertR(J)
preorderTrav(A)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) <ipython-input-6-30a46bbde68b> in <module> 20 G.insertR(J) 21 ---> 22 preorderTrav(A) 23 24 NameError: name 'preorderTrav' is not defined
def inorderTrav( subtree ):
if subtree is not None :
print('going left of',subtree.data)
inorderTrav( subtree.left )
print( 'centre',subtree.data )
print('going right of',subtree.data)
inorderTrav( subtree.right )
inorderTrav(A)
going left of A going left of B going left of D centre D going right of D centre B going right of B going left of E going left of H centre H going right of H centre E going right of E centre A going right of A going left of C going left of F centre F going right of F centre C going right of C going left of G going left of I centre I going right of I centre G going right of G going left of J centre J going right of J
def postorderTrav( subtree ):
if subtree is not None :
postorderTrav( subtree.left )
postorderTrav( subtree.right )
print( subtree.data, end=" " )
postorderTrav(A)
D H E B F I J G C A
Binary Tree TraversalΒΆ
Preorder, Inorder, Postorder are examples of Depth-First Search
- Solving mazes
- whether a network is connected or not?
We can also traverse a tree using Breadth-First Search
Breadth First Search TraversalΒΆ
- For example, a chess engine may build the game tree from the current position by applying all possible moves, and use breadth-first search to find a win position.
Breadth-First SearchΒΆ
- We cannot use recursion, since we are exploring level-by-level
- After visiting A's children, B and C, we need to visit B's children, then C's children, then B's grandchildren and then C's grandchildren, and so on...
# Implementation of the Queue ADT using a Python list.
class Queue :
# Creates an empty queue.
def __init__( self ):
self._qList = list()
# Returns True if the queue is empty.
def isEmpty( self ):
return len( self ) == 0
# Returns the number of items in the queue.
def __len__( self ):
return len( self._qList )
# Adds the given item to the queue.
def enqueue( self, item ):
self._qList.append( item )
# Removes and returns the first item in the queue.
def dequeue( self ):
assert not self.isEmpty(), "Cannot dequeue from an empty queue."
return self._qList.pop( 0 )
def breadthFirstTrav( bintree ):
# Create a queue and add the root node to it.
q = Queue()
q.enqueue( bintree )
# Visit each node in the tree.
while not q.isEmpty() :
# Remove the next node from the queue and visit it.
node = q.dequeue()
print( node.data, end=" " )
# Add the two children to the queue.
if node.left is not None :
q.enqueue( node.left )
if node.right is not None :
q.enqueue( node.right )
breadthFirstTrav(A)
A B C D E F G H I J
HeapsΒΆ
A heap is a complete binary tree.
Nodes are organized based on their data entry values
Heap (order) property
- max heap: Each non-leaf node has a value greater than its children
- min heap: Each non-leaf node has a value smaller than its children
HeapΒΆ
Data Structure with limited operations
${\tt insert(\ )}$ a new value into a heap
${\tt extract(\ )}$ a value from the top of the heap
Heap InsertionΒΆ
- When a new value is inserted,
- heap property (max-heap/min-heap) should be maintained, and
- it should remain a complete binary tree (heap!)
From a heap property perspective,ΒΆ
From a heap perspective, maintaining the shape of a complete binary tree,ΒΆ
- Recall: A Complete Binary Tree is
- A perfect binary tree till height $(h β 1)$ and
- the nodes on the lowest level fill the available slots from left to right leaving no gaps.
- A perfect binary tree till height $(h β 1)$ and
To restore the heap order property,ΒΆ
Let's add 41 now...ΒΆ
Heap ExtractionΒΆ
Only the root note of the heap
largest in case of max-heap
smallest in case of min-heap
Heap ImplementationΒΆ
Heap is a binary tree
Implement a heap using an array
Heap ImplementationΒΆ
Since a heap is a complete binary tree,
- no holes due to missing internal nodes.
Makes it easy to quickly locate:
- the parent of any node, and
- the left and right child of any node.
Given the array index $i$ of a node,
- parent = $(i - 1)\ //\ 2$
- left child = $ 2\ast i + 1$
- right child = $ 2\ast i + 2$
Heap ImplementationΒΆ
How to check if a nodeβs child link is null?
- check if the index is out of range! :-)
Does 23 have a right child?
- If it did, it would be at: 4 * 2 + 2 = 10
# An array-based implementation of the max-heap.
class MaxHeap :
# Create a max-heap with maximum capacity of maxSize.
def __init__( self, maxSize ):
self._elements = [-1]* maxSize # changed for list
self._count = 0
# Return the number of items in the heap.
def __len__( self ):
return self._count
# Return the maximum capacity of the heap.
def capacity( self ):
return len( self._elements )
# Add a new value to the heap.
def add( self, value ):
assert self._count < self.capacity(), "Cannot add to a full heap."
# Add the new value to the end of the list.
self._elements[ self._count ] = value
self._count += 1
# Sift the new value up the tree.
self._siftUp( self._count - 1 )
# Extract the maximum value from the heap.
def extract( self ):
assert self._count > 0, "Cannot extract from an empty heap."
# Save the root value and copy the last heap value to the root.
value = self._elements[0]
self._count -= 1
self._elements[0] = self._elements[ self._count ]
# Sift the root value down the tree.
self._siftDown( 0 )
return value
# Sift the value at the ndx element up the tree.
def _siftUp( self, ndx ):
if ndx == 0: # changed to check if ndx is root, then nothing to be done
return
if ndx > 0 :
parent = (ndx - 1) // 2 # changed (ndx - 1) not ndx
if self._elements[ndx] > self._elements[parent] :
# swap elements
print("(sift up) swapping",self._elements[ndx],self._elements[parent])
tmp = self._elements[ndx]
self._elements[ndx] = self._elements[parent]
self._elements[parent] = tmp
self._siftUp( parent )
# Sift the value at the ndx element down the tree.
def _siftDown( self, ndx ):
left = 2 * ndx + 1
right = 2 * ndx + 2
# Determine which node contains the larger value.
largest = ndx
if left < self._count and self._elements[left] >= self._elements[largest] :
largest = left
if right < self._count and self._elements[right] >= self._elements[largest]: # changed elif to if
largest = right
# If the largest value is not in the current node (ndx), swap it with
# the largest value and repeat the process.
if largest != ndx :
# swap elements
print("(sift down) swapping",self._elements[ndx],self._elements[largest])
tmp = self._elements[ndx]
self._elements[ndx] = self._elements[largest]
self._elements[largest] = tmp
self._siftDown( largest )
# Print the heap, level by level
def prnheap (self):
cnt = 0
j = 1
for i in range(self._count):
print(self._elements[i], end=" ")
if (i == j-1):
cnt += 1
j = j + 2**cnt
print("")
print("")
myheap = MaxHeap(15)
myheap.add(100)
myheap.add(84)
myheap.add(71)
myheap.add(60)
myheap.add(23)
myheap.add(12)
myheap.add(29)
myheap.add(1)
myheap.add(37)
myheap.add(4)
myheap.prnheap()
100 84 71 60 23 12 29 1 37 4
myheap.add(90)
myheap.prnheap()
(sift up) swapping 90 23 (sift up) swapping 90 84 100 90 71 60 84 12 29 1 37 4 23
myheap.add(41)
myheap.prnheap()
(sift up) swapping 41 12 100 90 71 60 84 41 29 1 37 4 23 12
Heap ExtractionΒΆ
myheap.extract()
myheap.prnheap()
(sift down) swapping 12 90 (sift down) swapping 12 84 (sift down) swapping 12 23 90 84 71 60 23 41 29 1 37 4 12
Time Complexity?ΒΆ
insert ( )
extract ( )
AnalysisΒΆ
insert into a heap takes $O(\log n)$ time.
extracting the root of the heap also takes $O(\log n)$ time.
Priority Queues can be implemented as a Heap!ΒΆ
Priority Queues as Min-heapΒΆ
HeapsortΒΆ
def simpleHeapSort( theSeq ):
# Create an array-based max-heap.
n = len(theSeq)
heap = MaxHeap( n )
# Build a max-heap from the list of values.
for item in theSeq :
heap.add( item )
# Extract each value from the heap and store them back into the list.
for i in range( n-1, -1, -1 ) :
theSeq[i] = heap.extract()
myseq = [ 8, 2, 6, 14, 5, 1]
mysorted = simpleHeapSort(myseq)
print(myseq)
(sift up) swapping 14 2 (sift up) swapping 14 8 (sift down) swapping 1 8 (sift down) swapping 1 5 (sift down) swapping 1 6 (sift down) swapping 2 5 (sift down) swapping 1 2 [1, 2, 5, 6, 8, 14]
Heapsort Time ComplexityΒΆ
Heap Construction
- inserting each element takes $O(\log n)$ time
- for $n$ items, total time taken: $O(n\log n)$
Extraction
- each extraction takes $O(\log n)$ time.
- total time for $n$ items: $O(n\log n)$
Heapsort in-placeΒΆ
Previous implementation requires the use of additional storage to build the heap structure.
It is possible to do it in place.
- The heap items in front, and the yet to be added items at the back.
Insertion Sort ΒΆ
A collection of sorted items and a collection of items yet to be sorted.
Both the sorted and unsorted collections within the same sequence structure.
List of sorted values at the front of the sequence and picks the next unsorted value from the first of those yet to be positioned.
- [4, 8, 6, 2]
- [4 | 8, 6, 2 ] - First item is always sorted.
- [4, 8 | 6 ,2 ] - 4 and 8 are in their right positions.
- [4, 6, 8 | 2 ] - 4, 6 and 8 are sorted
- [2, 4, 6, 8 ] - Fully sorted
ExtractionΒΆ
# Sift the value at the ndx element up the tree.
def siftUp( seq, ndx ):
if ndx == 0:
return
if ndx > 0 :
parent = ndx // 2
if seq[ndx] > seq[parent] : # swap elements
tmp = seq[ndx]
seq[ndx] = seq[parent]
seq[parent] = tmp
siftUp( seq, parent )
# Sift the value at the ndx element down the tree.
def siftDown( seq, end, ndx ) :
count = end - ndx
left = 2 * ndx + 1
right = 2 * ndx + 2
# Determine which node contains the larger value.
largest = ndx
if left < count and seq[left] >= seq[largest] :
largest = left
if right < count and seq[right] >= seq[largest]:
largest = right
# If the largest value is not in the current node (ndx), swap it with
# the largest value and repeat the process.
if largest != ndx :
tmp = seq[ndx]
seq[ndx] = seq[largest]
seq[largest] = tmp
siftDown( seq, end, largest )
# Sorts a sequence in ascending order using the heapsort.
def heapsort( theSeq ):
n = len(theSeq)
# Build a max-heap within the same array.
for i in range( n ) :
siftUp( theSeq, i )
# Extract each value and rebuild the heap.
for j in range( n-1, -1, -1) :
tmp = theSeq[j]
theSeq[j] = theSeq[0]
theSeq[0] = tmp
siftDown( theSeq, j, 0 )
myseq = [ 8, 2, 6, 14, 5, 1]
mysorted = heapsort(myseq)
print(myseq)